INTRODUCTION

In this research, we have proposed the (64, 40, 8) subcode of the third-order Reed-Muller (RM) code
to NASA for high-speed satellite communications. This RM subcode can be used either alone or as the
inner code of a concatenated coding system, with the NASA standard (255, 223, 33) Reed-Solomon (RS)
code as the outer code, to achieve high performance (low bit-error rate) with reduced decoding
complexity. It can also be used as a component code in a multilevel bandwidth-efficient coded
modulation system to achieve reliable bandwidth-efficient data transmission.

This report will summarize the key progress we have made toward achieving our eventual goal of 
implementing a decoder system based upon this code. 

In previous reports [1,6], we described results from our investigations of the complexities of various
sectionalized trellis diagrams for the proposed (64, 40, 8) RM subcode. We found a specific 8-section
trellis diagram for this code which requires the least decoding complexity and has the potential to
achieve a decoding speed of 600 Mbits per second (Mbps). The combination of a large number of states
and a high data rate is made possible by a high degree of parallelism throughout the architecture.
This trellis diagram was presented and described in detail in [1]. We then investigated circuit
architectures to determine the feasibility of VLSI implementation of a high-speed Viterbi decoder based
on this 8-section trellis diagram. We made detailed design and feasibility examinations of implementation
approaches for the key blocks. Our key results for block level implementation were presented in [6].

This report will focus on our recent progress and plans regarding development of the prototype
subtrellis integrated circuit (IC), with particular attention to the design methodology.

1. Summary of Previous Results

We will begin this section with a brief discussion of the system block diagram in which the proposed 
decoder is assumed to be operating. Next, we will present some of the results from our architecture 
development for a sub-trellis IC which will be the basic building block for a decoder system. 

System Block Diagram

A simplified block diagram of a receiver in which the proposed decoder may be used is shown in 
Fig. 1. The signal enters the receiver via an antenna and is first amplified by a low noise amplifier (LNA) 
before being passed to the 2-PSK demodulator. We assume the functions of carrier and timing acquisition
and gain control are properly performed in the demodulator. The output of the demodulator is sampled at 
the correct phase at the symbol rate of 960 MHz. The output of the sampler is converted to the digital 
domain by the 3-bit analog-to-digital converter (ADC) for decoding by the Viterbi Decoder block which 
follows. Our work currently focuses exclusively on the implementation of the Viterbi Decoder. 
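
The relation between the 960 MHz symbol rate and the 600 Mbps decoding speed can be checked with a few lines of arithmetic. The sketch below is only an illustration: it assumes that each 2-PSK symbol carries one code bit and that the load is split evenly over two parallel decoders, as in the design goal stated later in this report. The variable and function names are hypothetical.

```python
from fractions import Fraction

CODE_LENGTH = 64            # n: code bits per block of the (64, 40) RM subcode
INFO_BITS = 40              # k: information bits per block
SYMBOL_RATE_MSYM = 960      # receiver symbol rate in Msymbol/sec (one code bit per 2-PSK symbol)

# Code rate k/n = 5/8, kept exact with Fraction
code_rate = Fraction(INFO_BITS, CODE_LENGTH)

# Information throughput in Mbps at the output of the decoder system
info_rate_mbps = SYMBOL_RATE_MSYM * code_rate


def per_decoder_rates(n_decoders):
    """Return (symbol rate, information rate) seen by each of n_decoders
    parallel decoders, assuming blocks are shared evenly among them."""
    return (Fraction(SYMBOL_RATE_MSYM, n_decoders), info_rate_mbps / n_decoders)


if __name__ == "__main__":
    print("code rate:", code_rate)
    print("total information rate (Mbps):", info_rate_mbps)
    print("per-decoder rates with 2 decoders:", per_decoder_rates(2))
```

With two decoders, each one sees half of the symbol stream and delivers half of the information rate.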



Figure 1 Block diagram of a high speed satellite receiver employing 2-PSK signalling and a Viterbi Decoder. 


Summary of System Level Architecture Design

In our earlier reports [1,6], we described in detail the different ways in which parallelism can be
utilized to decode the (64, 40) RM code. We provide a brief summary of these descriptions in this
section.

There are many diverse issues at different levels of the design requiring consideration for 
implementation of the (64, 40) RM code at a rate of 600 Mbits/sec. Fig. 2 illustrates the different layers of 
hierarchy associated with the proposed implementation. First, there are N parallel decoders with each 
operating on a different independent block of 64 symbols. Given a decoder which can decode a 64-symbol 
block at a certain rate, using N decoders and having them each operate on a different block of 64 symbols
allows a throughput N times greater. Second, each decoder is implemented with K parallel isomorphic 
subtrellises. As described in [5], the trellis for an RM code can be decomposed into parallel isomorphic 
subtrellises that are connected at only the inputs and outputs as shown conceptually in Fig. 2 with K 
parallel subtrellises. This has a tremendous advantage for IC implementation because it minimizes the 
amount of routing required within the trellis which would otherwise be unrealizable at high speed for 
applications requiring large numbers of states. This is the key which makes an implementation using 
CMOS ICs at such a high rate and complexity possible. Third, there are a number of parameters
associated with the implementation of each of the K subtrellises. The first is the number of sections
in the subtrellis, denoted L. Next is the number of states at the end of each section i
(i = 1, 2, ..., L), denoted S_i, which will generally not be the same from section to section.
Finally, there is the radix of each section, denoted R_i for section i. As the number of sections L
decreases, the complexity of each section and the number of parallel branches per section increase.
These trade-offs are discussed in detail in [1].
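
To make the role of the states and the radix concrete, the following is a minimal sketch of the add-compare-select (ACS) operation performed in one trellis section. It is a generic software illustration, not the circuit described in this report. The function name, the branch list format and the toy values are hypothetical, and the sketch assumes a decoder that minimizes the path metric.

```python
def acs_section(prev_metrics, branches, num_states):
    """Run add-compare-select over one trellis section.

    prev_metrics: path metrics of the states at the start of the section.
    branches:     list of (from_state, to_state, branch_metric) tuples;
                  the number of branches entering a state is the radix.
    num_states:   number of states at the end of the section.
    Returns (new_metrics, survivors), where survivors[s] is the index of
    the winning branch into state s (None if no branch enters s).
    """
    new_metrics = [float("inf")] * num_states
    survivors = [None] * num_states
    for index, (src, dst, branch_metric) in enumerate(branches):
        candidate = prev_metrics[src] + branch_metric   # add
        if candidate < new_metrics[dst]:                # compare
            new_metrics[dst] = candidate                # select
            survivors[dst] = index
    return new_metrics, survivors


if __name__ == "__main__":
    # Toy section: 2 states in, 2 states out, radix 2 (hypothetical values)
    toy_branches = [(0, 0, 2), (1, 0, 0), (0, 1, 5), (1, 1, 1)]
    print(acs_section([0, 3], toy_branches, num_states=2))
```

In hardware, the loop body is replicated so that all states of a section are updated in parallel. A larger radix means more candidates are compared per state.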





Number of States: S_1, S_2, ..., S_(L-2), S_(L-1), S_L

Radix: R_1, R_2, ..., R_(L-1), R_L


Figure 2 Levels of hierarchy in the proposed Viterbi decoder implementation, (a) Parallel Viterbi 
decoders operating on different blocks of data, (b) Implementation with K parallel isomorphic 
subtrellises, (c) Subtrellis implementation. 

After examining a number of combinations of N, K, L, S, and R, we settled on a solution
with the detailed structure shown in Fig. 3. We call this structure Trellis 2, and each Viterbi Decoder
of Fig. 2b will have K equal to 32. In this solution, our design goal is to meet the speed objectives in
a currently available CMOS technology with N equal to 2.

Figure 3 Detailed subtrellis structure for Trellis 2.

The key to the implementation of a (64, 40) RM decoder will be the successful implementation of an
IC realizing the subtrellis shown in Fig. 3.

The key objectives of the subtrellis IC implementation are to:

1. Maximize the efficiency as measured by the utilization of the hardware (in other
   words, attempt to minimize the time the majority of the hardware is not being used).

2. Use a chip plan which minimizes the area used for routing (routing area is simply an
   overhead which should be minimized).

3. Approach the speed of 600 Mbits/sec with 2 parallel decoders.

4. Consider reliability and robustness issues. In particular, use the lowest speed system
   clock possible which allows high speed operation in order to reduce the number of
   issues which can limit the performance (which in this case would be clock skew
   between chips or race conditions both within and between the different ICs).

5. Consider the board design and the numbers of inputs and outputs to each chip to
   facilitate implementation of the final decoder system.

6. Keep the size of the IC on the order of 10 mm per side to facilitate its implementation
   and yield for testing.


2. Recent Results 

Our recent efforts have focused on the design and development of the prototype IC. The goal of this
portion of the project is to design and lay out the circuits in a computer-aided design database
which can be used by a fabrication facility to generate the masks necessary to fabricate the
prototype IC.

Initial Design Procedure

The design procedure for development of the prototype IC we have used to date is as follows: 

1. Block Level Design - Define the functional performance of the major blocks.

2. Timing Diagram Design - Define the flow of data through the chip based upon the
   design of the major blocks.

3. Circuit Design - Design circuits first at the gate level and then at the transistor level,
   depending upon the particular transistor logic style (complementary, pass-gate,
   dynamic, etc.), to perform the desired functions at the desired speed.

4. Circuit Layout - Create a full custom layout by hand which defines the location and
   size of the geometries which become the mask set for fabricating the chip.

5. Verification - Verify that the layout performs the functions desired of the circuit
   by extracting the connectivity and transistor geometry information and using the
   resulting file as part of an input control file for a circuit simulator, namely SPICE.

6. Full-Chip Layout - Repeat Steps 3 - 5 outlined above for each of the cells which make
   up the sub-blocks, which are then connected to make up the major blocks. Following
   this, perform these steps again as the major blocks are connected and then verified.
   Finally, verify the functionality of the entire layout.

7. Send a layout file for fabrication to an appropriate facility.

Chip Plan - Block Level Design

An outline of the overall block plan is shown in Fig. 4. The Clock Generation and Control block will
generate the necessary clock phases to clock the [...] which will generate the branch metrics for the
Add-Compare-Select Unit (ACSU). The outputs of the ACS [...] branch labels. These are input to the
Decoder [...] data at a 480 Msymbol/sec rate (300 Mbps).



Figure 4 (a) Block diagram of the IC being developed to implement a subtrellis.

Using the Initial Design Procedure outlined above, we completed Steps 1 - 3 for the entire chip. We
then developed a layout for an 8-Way ACS Cell, which will be the main block in the ACSU, and for other
key building blocks which will be repeated due to the modularity of the design. This allowed us to
develop the estimates for the chip layout of a subtrellis IC shown in Table 1. The simulations assumed
the use of a 0.6 µm double metal CMOS technology. With pads and routing, a conservative estimate of
the die size in this technology is 12 mm x 12 mm.


Table 1.

Block                          Transistor Count   Block Size
Clock Generation and Control   1,000              500 µm x 1,000 µm
Branch Metric Unit             2,500              1,000 µm x 1,000 µm
Add-Compare-Select Unit        275,000            8,000 µm x 11,000 µm
Decoder                        175,000            2,500 µm x 11,000 µm
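
The block estimates in the table above can be totalled with a short script. This is only a bookkeeping sketch: it sums the listed transistor counts and block areas and reports the side of a square with the same total area. It does not model pads, routing or the actual floorplan, and the dictionary layout and names are hypothetical.

```python
import math

# (transistor count, width in µm, height in µm) for each block in the table
BLOCKS = {
    "Clock Generation and Control": (1_000, 500, 1_000),
    "Branch Metric Unit": (2_500, 1_000, 1_000),
    "Add-Compare-Select Unit": (275_000, 8_000, 11_000),
    "Decoder": (175_000, 2_500, 11_000),
}

# Total transistor count over all blocks
total_transistors = sum(count for count, _, _ in BLOCKS.values())

# Total block area in mm^2 (1 mm^2 = 1e6 µm^2), excluding pads and routing
total_area_mm2 = sum(w * h for _, w, h in BLOCKS.values()) / 1e6

# Side of a square with the same area, as a rough lower bound on the die side
equivalent_side_mm = math.sqrt(total_area_mm2)

if __name__ == "__main__":
    print("total transistors:", total_transistors)
    print("total block area (mm^2):", total_area_mm2)
    print("equivalent square side (mm):", round(equivalent_side_mm, 1))
```

The Add-Compare-Select Unit dominates both totals, which is why the 8-Way ACS Cell is the critical building block.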

Limitations of the Initial Design Procedure

The procedure outlined above was very useful for us to obtain estimates of the size and complexity
of each of the blocks. However, it underscored a basic limitation of our Initial Design Procedure, which
is the adequate full-chip verification of the layout of a chip with nearly 500,000 transistors. This
procedure has been used successfully in other university projects in high speed decoders [2-4]. However,
the overall chip complexity of our project has turned out to be significantly greater than that of this
previous work.


LSI Logic (San Jose, California) Association, Relationship, and Plans

LSI Logic is a company based in San Jose, California focused on integrated circuit development for
high performance communication systems. One of the primary products of LSI Logic is a design
methodology which allows customers to design and develop custom integrated circuits in a systematic and
proven manner. Using this LSI Logic design methodology, customers begin with a set of
specifications and end with prototype IC devices in near state-of-the-art CMOS technologies.

This summer, the two students involved with the development of the prototype IC, Eric Nakamura
and Cecilia Chu, are at LSI Logic as Temporary Employees. As a result of the combination of coding and
VLSI development research work here at the University of Hawaii, LSI Logic has started what is planned
to be a long-term support relationship with our University. In this relationship, LSI Logic will supply
their design methodology and chip fabrication services to the University [...] updates (as is available
to all companies). While there are other benefits to LSI Logic, such as the potential for student hires
through internship experiences, research updates will [...] be given in a manner more timely than might
be considered typical due to the [...] faculty and students. In the longer term, LSI Logic plans to
support increasing amounts of coding and VLSI development here at the University.

Our students this summer are focusing their efforts on learning the LSI Logic methodology in the
context of a development project for LSI Logic, as they are hired as Temporary Employees. In the coming
[...], they will bring back to the University the LSI Logic methodology, which we plan to use for
development of the prototype subtrellis IC. The advantages for our current project are three-fold.

First, LSI Logic has currently available a 0.35 µm CMOS technology which would result in a nearly
4 times reduction in the layout size of a given cell. As a result, we would probably tend toward the
use of standard cells from the LSI Logic family as opposed to a full custom hand design as we had
planned. This takes us to the second point. In the LSI Logic design methodology, designs are described
in a Hardware Description Language (HDL). This programming language is first used to describe a given
function, which can then be simulated and verified. This HDL can then be used to directly develop a
circuit layout which would be a connection of standard cells from a cell library which would implement
the described function. This will greatly reduce development time as compared with our Initial Design
Procedure. While the chip area is greater when standard cells are used to implement a given function
compared with a full custom hand design (see footnote 1), the use of a more aggressive technology can
still result in a decrease in the overall chip size. And third and most important, the LSI Logic design
methodology has been used effectively to develop integrated circuits with more than one million
transistors. This proven method will greatly increase the probability of working devices in our initial
design. It will allow us to verify the circuit layout with much greater confidence as compared with the
Initial Design Procedure.

We expect that the design results from the Initial Design Procedure, which provided an initial design
of the prototype IC through to the circuit level, will be extremely beneficial to our development using
the LSI Logic design methodology. We are aware of the critical blocks and now have points of reference
from both the circuit and layout standpoints with which to compare our new designs. This should result
in a better solution than would have been obtained if either of the design approaches were used
exclusively.

1 Hand design signifies a layout drawn by hand on a computer as opposed to one automatically generated on a computer.


3. Summary 

In our recent efforts, we completed the majority of the design down to the circuit level using
what we call our Initial Design Procedure. Through this process, we have come to believe that a
prototype IC which can be used to implement the 600 Mbps decoder is achievable in a 0.6 µm CMOS
technology using the approaches described in our previous report. This summer our two students, Eric
Nakamura and Cecilia Chu, are at LSI Logic learning the LSI Logic design methodology, which they plan
to bring back to the University. We believe the LSI Logic design methodology will result in a circuit
layout whose performance can be better verified prior to fabrication than would otherwise be possible
using our Initial Design Procedure. Thus, the initial prototype circuits will have a much greater
chance of being functional. LSI Logic intends to be involved with the fabrication of the prototype IC
using their 0.35 µm CMOS technology. This newly developed association between LSI Logic and the
University of Hawaii should prove to be very useful as we progress in our development of the prototype
IC on our way to building a prototype decoder system.